ABSTRACT

The emerging Middle East respiratory syndrome coronavirus (MERS-CoV) causes lethal respiratory infections mainly on the 
Arabian Peninsula. The evolutionary origins of MERS-CoV are unknown. We determined the full genome sequence of a CoV 
directly from fecal material obtained from a South African Neoromicia capensis bat (NeoCoV). NeoCoV shared essential details 
of genome architecture with MERS-CoV. Eighty-five percent of the NeoCoV genome was identical to MERS-CoV at the nucleo- 
tide level. Based on taxonomic criteria, NeoCoV and MERS-CoV belonged to one viral species. The presence of a genetically di- 
vergent S1 subunit within the NeoCoV spike gene indicated that intraspike recombination events may have been involved in the 
emergence of MERS-CoV. NeoCoV constitutes a sister taxon of MERS-CoV, placing the MERS-CoV root between a recently de- 
scribed virus from African camels and all other viruses. This suggests a higher level of viral diversity in camels than in humans. 
Together with serologic evidence for widespread MERS-CoV infection in camelids sampled up to 20 years ago in Africa and the 
Arabian Peninsula, the genetic data indicate that camels act as sources of virus for humans rather than vice versa. The majority 
of camels on the Arabian Peninsula are imported from the Greater Horn of Africa, where several Neoromicia species occur. The 
acquisition of MERS-CoV by camels from bats might have taken place in sub-Saharan Africa. Camelids may represent mixing 
vessels for MERS-CoV and other mammalian CoVs. 


IMPORTANCE 

It is unclear how, when, and where the highly pathogenic MERS-CoV emerged. We characterized the full genome of an African 
bat virus closely related to MERS-CoV and show that human, camel, and bat viruses belong to the same viral species. The bat 
virus roots the phylogenetic tree of MERS-CoV, providing evidence for an evolution of MERS-CoV in camels that preceded that 
in humans. The revised tree suggests that humans are infected by camels rather than vice versa. Although MERS-CoV cases occur 
mainly on the Arabian Peninsula, the data from this study together with serologic and molecular investigations of African cam- 
els indicate that the initial host switch from bats may have taken place in Africa. The emergence of MERS-CoV likely involved 
exchanges of genetic elements between different viral ancestors. These exchanges may have taken place either in bat ancestors or 
in camels acting as mixing vessels for viruses from different hosts. 


doi:10.1128/JVI.01498-14

Human coronaviruses (HCoVs) belong to the genera Alphacoronavirus
and Betacoronavirus within the subfamily Coronavirinae.
Betacoronaviruses are further divided into four genetic clades,
termed clades a to d (1). The genetic diversity of CoVs in bats
exceeds that known for any other host, which is compatible with
bats being the major reservoir of mammalian CoVs (2).

In 2002 to 2003, an emerging HCoV, termed severe acute respiratory
syndrome coronavirus (SARS-CoV), caused a pandemic involving about
8,000 cases, about 10% of whom died. SARS-CoV belonged to
Betacoronavirus clade b. The evolutionary origins of SARS-CoV
involved bat hosts, possibly with civets as intermediate hosts and
the source of human infection (3, 4).

Because camels on the Arabian Peninsula show high rates of
neutralizing antibodies against MERS-CoV and harbor viruses that are
genetically highly related to those from human cases, these animals
are considered to constitute the source of human infections (7-9).
High rates of antibodies against MERS-CoV were recently found in
African camels, and a MERS-CoV strain was detected in an Egyptian
camel likely imported from Sudan (10-12).

MERS-CoV belongs to Betacoronavirus clade c (13, 14). The
prototype clade c betacoronaviruses, termed HKU4 and HKU5,
were detected in bats (15). HKU4 and HKU5 form two separate 
species in genetic sister relationship to MERS-CoV (13). Distinct 
clade c betacoronaviruses putatively representing another clade c 
Betacoronavirus species (Erinaceus CoV [EriCoV]) were recently 
described in hedgehogs (16). We and others characterized small 
genomic sequence fragments of bat CoVs (BtCoVs) that were 
closely related to MERS-CoV and suggested that MERS-CoV an- 
cestors may have evolved in bats (17-19). Because these sequence 
fragments encompassed only a few hundred nucleotides from a 
single gene, the RNA-dependent RNA polymerase (RdRp) gene, the 
evolutionary relationship of these bat CoVs with MERS-CoV 
could not be conclusively defined (20, 21). A bat virus most likely 
representing a potential MERS-CoV ancestor was detected in a 
Neoromicia capensis bat from South Africa (22). 

In this study, we characterized the full genome of the Neoromi- 
cia bat CoV, referred to here as NeoCoV. We determined that 
NeoCoV belongs to the same viral species defined by MERS-CoV 
based upon established taxonomic criteria (1, 21). Analysis of the 
NeoCoV genome pointed toward nonrecent recombination 
events within the MERS-CoV species. The bat virus roots the phy- 
logenetic tree of MERS-CoV and shows that MERS-CoV evolu- 
tion in camelids likely preceded that in humans. 


MATERIALS AND METHODS 


Sample processing and full-genome sequencing. A fecal specimen from 
a Neoromicia bat from South Africa was sampled and tested positive for 
CoVs, as described previously (22). To obtain the full sequence of the 
NeoCoV genome, 70 heminested reverse transcription-PCR (RT-PCR) 
assays were developed (primer sequences and PCR conditions are listed in 
Table S1 in the supplemental material). These assays were designed to
amplify overlapping fragments of about 800 bp from all known MERS-CoV
sequences.

Genomic fragments that could not be amplified by these assays were 
connected by bridging RT-PCR using NeoCoV-specific primers (available 
upon request) and sequenced by dye terminator chemistry. Determina- 
tion of genome ends was done by using a rapid amplification of cDNA 
ends kit (Roche, Penzberg, Germany). 

Genomic analyses. Nucleotide and amino acid sequences of predicted 
open reading frames (ORFs) and the full genome of NeoCoV and related 
betacoronaviruses were aligned by using MAFFT (23). The pairwise iden- 
tities of the genome and all ORFs and predicted proteins of NeoCoV were 
calculated by using MEGA5 (24). Similarity plots of CoV clade c genomes 
were generated by using SSE v1.1 (25), using a sliding window of 400 and 
a step size of 40 nucleotides (nt). Phylogenetic analyses of predicted ORFs 
were done by using MrBayes v3.1 (26), using a GTR+G+I nucleotide or 
a WAG amino acid substitution model and 2,000,000 generations sam- 
pled every 100 steps. Trees were annotated by using the latter 75% of all 
trees in TreeAnnotator v1.5 and visualized with FigTree v1.4 from the 
BEAST package (27). 
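
The sliding-window comparison described above can be illustrated with a minimal Python sketch. It is not the SSE implementation: the function name window_identity and the choice to skip alignment columns containing a gap are assumptions made for illustration. Only the window size of 400 nt and the step size of 40 nt are taken from the text.

```python
from typing import List, Tuple


def window_identity(seq_a: str, seq_b: str,
                    window: int = 400, step: int = 40) -> List[Tuple[int, float]]:
    """Percent identity of two aligned sequences in sliding windows.

    Returns (window midpoint, percent identity) pairs. Columns in which
    either sequence has a gap ('-') are skipped; this is an assumption,
    not necessarily how SSE treats gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(seq_a) - window + 1, step):
        matches = 0
        compared = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a == "-" or b == "-":
                continue
            compared += 1
            if a.upper() == b.upper():
                matches += 1
        if compared:  # windows consisting only of gaps yield no point
            points.append((start + window // 2, 100.0 * matches / compared))
    return points
```

Plotting the returned midpoints against the identity values gives a similarity plot of the kind shown in Fig. 2B.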


RESULTS AND DISCUSSION 


The NeoCoV-positive bat was identified as a female Neoromicia 
capensis (shown in Fig. 1) based on size (forearm length of 35 mm 
and mass of 5.5 g) and dental and cranial characteristics (28). 
Typing was confirmed by characterization of the cytochrome b 
and cytochrome oxidase I (COI) genes (GenBank accession num- 
bers KJ756000 and KJ756001), allowing definite species identifi- 
cation as N. capensis. 

FIG 1 Neoromicia capensis bat. The absence of a tiny upper premolar separates
it from similarly sized Pipistrellus and Hypsugo bats. The presence of an occip-
ital helmet separates it from Neoromicia zuluensis, the species to which it was
assigned based on preliminary morphological criteria.

The full NeoCoV genome sequence was obtained directly from
fecal material stored in RNAlater by using the panel of PCR assays
developed for this study as well as NeoCoV-specific primers. This
suggests that the panel could be applied to characterize genetically
diversified MERS-CoVs in further studies. Figure 2A
shows a graphical representation of the NeoCoV genome (Gen- 
Bank accession number KC869678). This genome contained 
30,100 nt excluding the poly(A) tail, with a G/C content of 40%. 
This was comparable to MERS-CoV strains, which range in size 
from 30,100 to 30,107 nt and have a G/C content of 41%. The 
number and order of NeoCoV open reading frames (ORFs) were 
identical to those of MERS-CoV in the order ORF1ab-spike-
ORF3-ORF4ab-ORF5-envelope (E)-membrane (M)-nucleocapsid
(N)-ORF8b. As in MERS-CoV, a ribosomal frameshift site and 16
nonstructural protein (NSP) domains within ORF1ab were pre-
dicted. Table 1 provides information on the size and genomic
location of these NSP domains. 

Figure 2A and Table 2 provide details on the predicted ORFs, 
transcription regulatory sequences (TRSs), and their genomic
locations. In analogy to MERS-CoV, eight putative TRSs with
the conserved TRS core motif of clade c betacoronaviruses,
AACGAA, preceded predicted ORFs. The predicted leader TRS core
sequence of NeoCoV (TTAACGAACT) and the predicted body 
TRS core sequences of NeoCoV were completely identical to those 
of MERS-CoVs. This included TRS core sequences preceding the 
E (AAAACGAACT) and N (TTAACGAATC) genes showing mi- 
nor sequence differences, as observed previously for MERS-CoV 
(13). As in MERS-CoV, no separate body TRSs preceding the pre- 
dicted AUG codons of ORF4b and ORF8b were detected. 
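
A scan for candidate TRS core motifs, as described above, might be sketched as follows. This is an illustrative helper and not the procedure used in the study: the function name find_trs_candidates and the flank of 6 nt (which yields the 18-nt context listed in Table 2) are assumptions. The core motif AACGAA is taken from the text.

```python
import re
from typing import List, Tuple

TRS_CORE = "AACGAA"  # conserved clade c core motif, as stated in the text


def find_trs_candidates(genome: str, core: str = TRS_CORE,
                        flank: int = 6) -> List[Tuple[int, str]]:
    """Return (1-based start of the core motif, motif with flanking context).

    The genome is treated as DNA; 'U' is converted to 'T' so that RNA
    notation is also accepted. Overlapping hits are reported.
    """
    genome = genome.upper().replace("U", "T")
    hits = []
    # A lookahead pattern matches at every position, including overlaps.
    for match in re.finditer("(?=" + re.escape(core) + ")", genome):
        lo = max(0, match.start() - flank)
        hi = min(len(genome), match.start() + len(core) + flank)
        hits.append((match.start() + 1, genome[lo:hi]))
    return hits
```

A hit is only a candidate: whether it acts as a body TRS also depends on its position upstream of a predicted ORF, which this sketch does not evaluate.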

FIG 2 Genome organization of NeoCoV and sequence identity compared to
other clade c betacoronaviruses. (A) Genome organization of NeoCoV. The
NeoCoV genome is represented by a black line; ORFs are indicated by gray
arrows. The ribosomal frameshift site (RFS) is marked with an arrowhead. The
locations of transcription regulatory core sequences (TRSs) following the
leader (L) are marked by labeled dots and numbered in their order of appear-
ance from the genomic 5' terminus. (B) Genomic sequence identity between
NeoCoV and other clade c betacoronaviruses. Plots were generated by using
SSE version 1.1 (25). The graph representing the comparison of the phyloge-
netically basal camel virus NRCE-HKU205 and NeoCoV is not shown due to a
total overlap in the curve resulting from the comparison between NeoCoV and
human MERS-CoV.

Amino acid sequence identity in seven concatenated NSP do-
mains has been established by the International Committee on
Taxonomy of Viruses (ICTV) for CoV species demarcation (1,
21). As shown in Table 3, the amino acid sequence identity of these
translated domains of NeoCoV was 97.2 to 97.4% compared to
MERS-CoV strains. Because this exceeded the 90% threshold de-
fined to separate CoV species (1), NeoCoV and MERS-CoV be-
longed to the same species. The established Betacoronavirus clade
c species HKU4 and HKU5 and hedgehog CoV shared 85.3 to 
88.7% amino acid sequence identity with MERS-CoVs from hu- 
mans and camels as well as NeoCoV, substantiating the classifica- 
tion of MERS-CoV as a separate species. 
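
The species demarcation rule applied above reduces to a threshold comparison, sketched here in Python. The identity ranges are copied from the concatenated-domain row of Table 3; the names SPECIES_THRESHOLD, CONCATENATED_IDENTITY, and same_species are hypothetical and serve only to illustrate the reasoning.

```python
from typing import Dict, Tuple

# Percent amino acid identity threshold for CoV species, as cited in the text.
SPECIES_THRESHOLD = 90.0

# (minimum, maximum) identity of NeoCoV in the seven concatenated domains,
# copied from the last row of Table 3.
CONCATENATED_IDENTITY: Dict[str, Tuple[float, float]] = {
    "MERS-CoV": (97.2, 97.4),
    "HKU4": (85.3, 85.4),
    "HKU5": (88.7, 88.7),
    "EriCoV": (86.5, 86.5),
}


def same_species(identity_range: Tuple[float, float],
                 threshold: float = SPECIES_THRESHOLD) -> bool:
    """True if even the lowest observed identity exceeds the threshold."""
    return min(identity_range) > threshold


# Classification of NeoCoV relative to each comparison virus.
calls = {name: same_species(r) for name, r in CONCATENATED_IDENTITY.items()}
```

Using the minimum of each range is a conservative choice made for this sketch; with the values in Table 3, the outcome is the same for either end of each range.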

To analyze the relationships of NeoCoV with MERS-CoV and 
other clade c betacoronaviruses beyond the domains used for spe- 
cies delineation, full-genome comparisons were made. NeoCoV 
shared 85.5% to 85.6% overall nucleotide identity with MERS- 
CoVs from humans and camels. Nucleotide identity with other 
clade c betacoronaviruses was considerably lower, at 25.5 to 
51.5%. Figure 2B shows that the nucleotide identity between 
NeoCoV and MERS-CoV decreased in the genomic region encod- 
ing the spike glycoprotein. Sequence identity was lower toward the 
5’ end of the spike ORF than toward its 3’ end. The translated spike 
ORF of NeoCoV showed 64.3 to 64.6% identity with MERS-CoV 
and 60.5 to 63.6% identity with other clade c betacoronaviruses. 
Table 2 shows that the amino acid sequence identities between 
NeoCoV and MERS-CoV were higher for all other ORFs. The 
genes encoding the structural proteins E, M, and N showed high 
levels of sequence identity between NeoCoV and MERS-CoV 
strains, up to 89.0% (E), 94.5% (M) and 91.7% (N). The functions 
of MERS-CoV ORF3, ORF4ab, ORF5, and ORF8b are poorly un- 
derstood. Sequence identities between these ORFs from NeoCoV 
and those from MERS-CoV ranged between 76.5% (ORF3) and 
88.4% (ORF5). These identity levels exceed those between any 
other clade c betacoronavirus and MERS-CoV by 2- to 3-fold. 
This includes ORF4a, which acts as an interferon antagonist, pre- 
sumably by interaction with double-stranded RNA in MERS-CoV 
(29). The 23 amino acid (aa) positions suggested to form a double- 
stranded RNA-binding domain in the MERS-CoV 4a protein (29) 
were completely conserved in NeoCoV, suggesting that the pre- 
dicted NeoCoV protein might have a homologous function. 

TABLE 1 Prediction of the putative pp1a/pp1ab cleavage sites of
NeoCoV based on sequence comparison with MERS-CoV strain EMC/2012

NSP     First amino acid residue-   Protein size           Putative functional
        last amino acid residue^a   (no. of amino acids)   domain(s)^b
NSP1    Met!-Gly'”?                 193
NSP2    Asp"*4-Gly*°”               664
NSP3    Ala®°8-Gly?””               1,886                  ADRP, PL2pro
NSP4    Ala??”44-Gln??°°            507
NSP5    Ser*?°!-Gln??°°             306                    3CLpro
NSP6    Ser?°°7-Gln?88              292
NSP7    Ser*®4?-Gln?93+             83
NSP8    Ala®???-Gln*!°?             198                    Primase
NSP9    Asn‘*!3!-Gln#*4°            110
NSP10   Ala*?4!-Gln*980             140
NSP11   Ser*8!-Leu*?™*              14                     Short peptide at the end of ORF1a
NSP12   Ser*?81-Gln??4              933                    RdRp
NSP13   Ala®?!°-Gln??”              598                    Hel, NTPase
NSP14   Ser??!?-Gln°                524                    ExoN, NMT
NSP15   Gly®*??-Gln®”7?             343                    NendoU
NSP16   Ala®7®°-Arg?°8?             303                    OMT

^a Superscript numbers indicate positions in polyprotein pp1a/pp1ab or positions in the
available sequence with the supposition of a ribosomal frameshift based on the
conserved slippery sequence (UUUAAAC) of coronaviruses.
^b ADRP, ADP-ribose 1-phosphatase; PL2pro, papain-like protease 2; 3CLpro,
coronavirus NSP5 protease; Hel, helicase; NTPase, nucleoside triphosphatase; ExoN,
exoribonuclease; NMT, N7 methyltransferase; NendoU, endoribonuclease; OMT,
2'-O-methyltransferase.


TABLE 2 Coding potential, putative transcription regulatory sequences, and sequence comparison with prototype clade c betacoronaviruses

                                                                      % amino acid identity^b of BtCoV/Neoromicia/PML-PHE1/RSA/2011 with:
ORF      nt positions   No. of        Sequence^a                      MERS-CoV^c   HKU4^d      HKU5^e      EriCoV^f
         (start-end)    amino acids
ORF1ab   281-21528      7,082         00057 GATCTTAACGAACTTAAA ,,     92.7         73.7        76.0        73.9
Spike    21470-25504    1,344         o1419 CAGATTAACGAACTTGTA , 1 499  64.3-64.6  60.5-60.8   61.5        63.6
ORF3     25519-25830    103           95501 CTAATTAACGAACTTCCA 551    76.5-78.4    40.7-44.0   47.9        27.3-28.4
ORF4a    25839-26168    109           95q03 TTAATTAACGAACTCTAT 5540   87.0-88.0    37.4-38.3   41.9-42.9   40.7
ORF4b    26044-26820    258                                           83.7-85.4    28.6-29.4   26.8-26.8   39.0
ORF5     26827-27501    224           96813 GATTTTAACGAACTATGG 639    87.1-88.4    47.1        54.8-55.7   51.3
E        27577-27825    82            37563 ATGGAAAACGAACTATGT yas    89.0         73.2-74.4   72.0        76.8
M        27840-28499    219           a7g1g GGGTTTAACGAACTCCTT 37835  93.6-94.5    81.7-82.2   82.6-83.1   80.7-81.2
N        28557-29801    414           agsog GATCTTAACGAATCTTAA ygs45  91.3-91.7    76.5-76.8   76.0        72.7-73.0
ORF8b    28603-29202    199                                           81.1-83.9    48.7-50.8   52.6-55.8   45.4-46.4

^a Underlined type indicates conserved nucleotides of the putative leader TRS core sequence. Subscripted numbers indicate positions in the BtCoV/Neoromicia/PML-PHE1/RSA/2011 genome.
^b Calculated with MEGA5 (24) using a pairwise deletion option.
^c GenBank accession numbers JX869059, KC164505, KC776174, KF186567, KF192507, KF600612, KF600620, and KJ477102.
^d GenBank accession numbers EF065506, EF065507, EF065508, and DQ648794.
^e GenBank accession numbers EF065505, EF065509, EF065510, EF065511, and EF065512.
^f GenBank accession numbers KC545383 and KC545386.


To confirm the genetic relationships suggested by sequence
distance comparisons, Bayesian phylogenies of major ORFs were
reconstructed. As shown in Fig. 3A, NeoCoV clustered with a basal
sister relationship to a clade containing MERS-CoV from humans
and camels in all ORFs except the spike ORF. In the spike ORF,
NeoCoV clustered with European hedgehog CoVs and a Nycteris
bat CoV from Ghana. To find reasons for the variant tree topology
in the spike ORF, Bayesian phylogenetic reconstructions and se-
quence distance analyses were done on two different spike ORF
data sets. One data set represented subunit S1, which contains the
receptor-binding domain (RBD). Another data set represented
subunit S2, involved in the fusion of viral and cellular membranes.
Figure 3B shows that in the phylogenetic tree of the S1 subunit,
NeoCoV clustered distantly from MERS-CoV at the same position
as that observed in the tree based on the full spike sequence. Ac-
cordingly, MERS-CoV and NeoCoV showed only 46.0% amino
acid sequence identity in the S1 subunit. In the S2 subunit, Neo-
CoV shared a monophyletic origin with MERS-CoV in a basal
sister relationship to MERS-CoVs from humans and camels, sim-
ilar to the phylogenetic reconstructions in all other ORFs. Mono-
phyly correlated with a higher degree of amino acid sequence
identity (87.2%) between MERS-CoV and NeoCoV in the S2 sub- 
unit. 
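
The subunit-specific identities reported above can be sketched in Python, assuming that a pairwise alignment of two spike proteins is already available. The boundary after residue 747 of the reference follows the S1/S2 definition given for MERS-CoV strain EMC/2012 in the legend to Fig. 3. The helper names and the decision to ignore gapped columns are assumptions made for illustration, not the procedure used in MEGA5.

```python
from typing import Tuple


def column_of_residue(aligned_ref: str, residue: int) -> int:
    """0-based alignment column holding the given 1-based ungapped residue."""
    seen = 0
    for col, ch in enumerate(aligned_ref):
        if ch != "-":
            seen += 1
            if seen == residue:
                return col
    raise ValueError("residue number exceeds reference length")


def percent_identity(a: str, b: str) -> float:
    """Identity over columns without a gap in either sequence (assumption)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def subunit_identities(aligned_ref: str, aligned_query: str,
                       s1_last_residue: int = 747) -> Tuple[float, float]:
    """Return (S1 identity, S2 identity) for an aligned pair of spike proteins.

    The split is placed after the reference residue s1_last_residue.
    """
    split = column_of_residue(aligned_ref, s1_last_residue) + 1
    s1 = percent_identity(aligned_ref[:split], aligned_query[:split])
    s2 = percent_identity(aligned_ref[split:], aligned_query[split:])
    return s1, s2
```

Mapping the boundary through the gapped reference keeps the split at the same residue regardless of insertions in the query sequence.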

These data suggested that the human-pathogenic MERS-CoV 
variant might be the result of nonrecent recombination events 
involving as-yet-unknown partners. Typical recombination 
breakpoints in CoV genomes encompass the spike gene (2). In- 
traspike recombination between the S1 and S2 subunits has been 
hypothesized to be the major mechanism involved in the emer- 
gence of SARS-CoV from bat and civet ancestors (30). The loca- 
tion of RBDs at either the N or the C termini of the S1 subunits of 
HCoV-229E and mouse hepatitis virus (MHV) has been inter- 
preted as further evidence for the interchangeability of spike sub- 
units (30). 

TABLE 3 Comparison of amino acid identities of seven conserved
replicase domains of NeoCoV for species classification

                        % amino acid identity with^a:
Domain                  MERS-CoV^b   HKU4^c      HKU5^d      EriCoV^e
ADRP                    90.6         59.1-60.4   63.1        67.5
NSP5 (3CLpro)           96.4-97.1    80.4-80.7   82.4-83.0   78.4-79.1
NSP12 (RdRp)            98.6-98.7    89.9-90.0   92.5        89.1-89.5
NSP13 (Hel, NTPase)     98.5-98.7    91.8-92.3   94.5        91.0-91.1
NSP14 (ExoN, NMT)       97.9-98.3    85.7-86.6   91.8-92.0   89.1-89.5
NSP15 (NendoU)          93.6-94.5    77.2-77.8   81.9-82.2   81.9-82.5
NSP16 (OMT)             96.4         83.4        87.1        86.8-87.8
Concatenated domains    97.2-97.4    85.3-85.4   88.7        86.5

^a Calculated with MEGA5 (24) using a pairwise deletion option.
^b Including GenBank accession numbers JX869059, KC164505, KC776174, KF186567,
KF192507, KF600612, KF600620, and KJ477102.
^c Including GenBank accession numbers EF065506, EF065507, EF065508, and
DQ648794.
^d Including GenBank accession numbers EF065505, EF065509, EF065510, EF065511,
and EF065512.
^e Including GenBank accession numbers KC545383 and KC545386.


The different S1 subunit suggested that NeoCoV was not the
direct ancestor of MERS-CoV. Of note, CoVs are mostly associ-
ated with chiropteran hosts on the genus level (2), and MERS-
CoV was shown to infect cells from vespertilionid bats (31). Ac-
cording to this principle, MERS-CoV variants carrying a spike
gene closely related to human-pathogenic MERS-CoV may exist 
in bats belonging to the family Vespertilionidae and specifically the 
genus Neoromicia. This scenario parallels the spike gene diversity 
found in bat CoV ancestors of SARS-CoV. All ancestral bat SARS- 
related CoVs described since 2005 have had highly diversified 
spike genes, which differed from human SARS-CoV by about 20% 
of their amino acid sequences (4). Only recently, a bat virus car- 
rying a spike gene related to human SARS-CoV and capable of 
using the SARS-CoV receptor molecule ACE2 was found (3). In 
agreement with the principle that chiropteran hosts can harbor 
closely related CoVs, the bat CoV carrying the human SARS-CoV- 
related spike gene occurred in Rhinolophus sinicus, which is the 
same bat species yielding multiple SARS-related CoV lineages car- 
rying divergent spike genes since 2005 (2). 

Because NeoCoV clustered with a basal sister relationship to 
MERS-CoV in all ORFs and because NeoCoV and MERS-CoV 
belonged to one virus species, the bat virus can be used to infer the 
root of the phylogenetic tree of MERS-CoV. Figure 3C shows a 
Bayesian phylogenetic reconstruction of all available MERS-CoV 
full genomes from camels and representative MERS-CoV full ge- 
nomes from humans, rooted by NeoCoV. The MERS-CoV de- 
tected in an African camel, termed NRCE-HKU205 (10), clus- 
tered with high statistical support in basal sister relationship to 
MERS-CoVs from humans and camels from the Arabian Penin- 
sula. Despite the phylogenetic clustering of NRCE-HKU205 in an 
intermediate position between NeoCoV and MERS-CoVs from 
the Arabian Peninsula, the maximum nucleotide distance of 
NeoCoV and NRCE-HKU205 did not differ from the maximum 
nucleotide distance of NeoCoV and the other MERS-CoVs, at 
14.5% and 14.4 to 14.5%, respectively. The maximum nucleotide 
distance within MERS-CoVs from camels was 0.6%. This was 
slightly higher than the 0.4% maximum distance within MERS- 
CoVs from humans, although only 11 near-full-length genomes 
are available for MERS-CoVs from camels, compared to 38 near- 
full-length genomes for MERS-CoVs from humans. 
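
The maximum nucleotide distances compared above can be computed as the largest pairwise p-distance within a set of aligned genomes. The following Python sketch assumes uncorrected distances and ignores gapped columns; both choices, and the function names, are assumptions for illustration rather than a description of the analysis performed in the study.

```python
from itertools import combinations
from typing import Sequence


def p_distance(a: str, b: str) -> float:
    """Percentage of differing sites among columns without gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(x != y for x, y in pairs) / len(pairs)


def max_pairwise_distance(aligned: Sequence[str]) -> float:
    """Largest p-distance over all pairs of sequences in the alignment."""
    if len(aligned) < 2:
        raise ValueError("at least two sequences are required")
    return max(p_distance(a, b) for a, b in combinations(aligned, 2))
```

Because the maximum over all pairs can only stay the same or grow as sequences are added, the smaller number of camel genomes would, if anything, tend to lead to an underestimate of diversity in camels relative to humans.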

The existence of more diversified MERS-CoV sequences in hu-
mans, particularly outside the Arabian Peninsula, cannot be fully
excluded. However, the phylogenetic position of the outlier camel
MERS-CoV and the slightly higher level of genetic diversity in 
camels suggest that the evolution of MERS-CoV in camels pre- 
ceded that in humans and that camels represent donors of viruses 
for humans rather than vice versa. These genetic data are corrob- 
orated by the existence of specific antibodies against MERS-CoV 
in up to 90% of camelids from the Arabian Peninsula and Africa 
for at least 20 years (7, 9, 11, 12). In contrast, sera from 
children with respiratory disease sampled in 2010 to 2011 in Saudi 
Arabia (32) and sera from blood donors and slaughterhouse 
workers sampled in Saudi Arabia in 2012 (33) showed no evidence 
of antibodies against MERS-CoV. 

Interestingly, the camel yielding the genetic outlier MERS- 
CoV was likely imported from Sudan into Egypt (10). The major- 
ity of camels in the Arabian Peninsula are imported from coun- 
tries in the Greater Horn of Africa, such as Somalia, Sudan, and 
Kenya (10, 11). We and others have recently shown that MERS- 
CoV-neutralizing antibodies occur frequently in camels from 
eastern Africa (11, 12). A hypothetical scenario might thus imply a 
spillover of viruses from bats to camels in the Greater Horn of 
Africa. Of note, 10 out of 11 Neoromicia species listed by the
International Union for Conservation of Nature
(http://www.iucnredlist.org/), including N. capensis, are common in
this region. An alternative scenario implying an exchange of viruses
between Neoromicia bats and camels on the Arabian Peninsula is 
unlikely, because Neoromicia bats are not known to occur in this 
region. 

FIG 3 Bayesian phylogenies of clade c betacoronaviruses, including NeoCoV.
(A) Phylogenies of ORF1a, ORF1b, and ORFs coding for structural proteins.
(B) Phylogenies of the S1 and S2 subunits, corresponding to amino acid posi-
tions 1 to 747 and 748 to 1353, respectively, of MERS-CoV strain EMC/2012.
NeoCoV is shown in red, camel MERS-CoV is shown in blue, and human
MERS-CoV is shown in cyan. HCoV-OC43 was used as an outgroup. (C)
Phylogeny of MERS-CoV full genomes. MERS-CoVs obtained from humans
are shown in black, and MERS-CoVs from camels are shown in blue. NeoCoV
was used for rooting the tree. For all trees, statistical support of grouping from
Bayesian posterior probabilities is shown at deep nodes. Only values above 0.7
are shown. The bar represents genetic distance. GenBank accession numbers
are KJ477102 for NRCE-HKU205, KJ156881 for Wadi-Ad-Dawasir 1 2013,
JX869059 for EMC/2012, KJ650296 for KFU-HKU19D, KC776174 for Jordan-
N3/2012, KJ650297 for KFU-HKU1, KJ156910 for Hafr-Al-Batin2 2013,
KF600613 for Riyadh 3 2013, KF186567 for Al-Hasa 1 2013, KC164505 for
England1, KF961221 for Qatar3, KJ713299 for KSA-CAMEL-376, KJ156949
for Taif1 2013, KJ556336 for Jeddah1 2013, KJ713297 for KSA-CAMEL-503,
KJ713295 for KSA-CAMEL-505, KF192507 for Munich 2013, KJ650098 for
Qatar 2 2014, KF745068 for FRA/UAE, KF600630 for Buraidah1 2013,
KJ650295 for KFU-HKU13, KF600628 for Hafr-Al-Batin1 2013, KJ713298 for
KSA-CAMEL-363, KJ713296 for KSA-CAMEL-378, KF600620 for Bisha1
2012, KC869678 for NeoCoV, NC_005147 for HCoV-OC43, EF065512 for
HKU5-5, NC_009020 for HKU5-1, NC_008315 for BtCoV/133, NC_009019
for HKU4-1, KC545386 for EriCoV/2012-216, KC545383 for EriCoV/2012-
174, and KM027259 for Jeddah 2014 C9055.


The deep branches leading to NeoCoV in phylogenetic recon-
structions and the differences observed in the spike gene of this bat
virus suggest that this putative host switch may have occurred
nonrecently. Although NeoCoV formed one viral species together
with MERS-CoVs from humans and camels, unidirectional host
switching events from bat hosts to other mammalian hosts such as
camels and different evolutionary histories in these hosts can be
assumed. The bat, camel, and human hosts of this CoV species are
thus unlikely to fulfill the population criteria required for coales-
cent dating. Additionally, dating of CoV branches deeper than
0.05 substitutions per site, such as those leading to the predicted
ancestor of NeoCoV and MERS-CoV, was found to be highly un- 
reliable and may greatly underestimate the true evolutionary his- 
tory of CoVs (34). Therefore, and because of the evidence for 
recombination detected in the NeoCoV spike gene, no molecular 
dating of the projected ancestor shared by NeoCoV and other 
MERS-CoVs was conducted. 

The putative ancient recombination events giving rise to 
MERS-CoV may have taken place in two candidate hosts to be 
explored. The first and more likely option is vespertilionid bats. It 
would be highly relevant to fully characterize additional bat vi- 
ruses from Africa and the Arabian Peninsula. Alternatively, cam- 
elids may represent a putative mixing vessel, similar to the role of 
swine for influenza A viruses (35). Sera from camelids should be 
tested for antibodies against the NeoCoV S1 subunit to gather 
evidence for infection with this CoV lineage. In parallel, camelids 
should be screened to identify genetically diversified viruses po- 
tentially related to their putative bat ancestors. The putative role of 
camelids as recipients of CoVs from other mammalian hosts is 
supported by the occurrence of viruses in camelids that are closely 
related to human CoV-229E and bovine CoV (36, 37). 